 spine surgery


Diagnosing Hallucination Risk in AI Surgical Decision-Support: A Sequential Framework for Sequential Validation

Chen, Dong, Wei, Yanzhe, He, Zonglin, Kuang, Guan-Ming, Ye, Canhua, An, Meiru, Peng, Huili, Hu, Yong, Tao, Huiren, Cheung, Kenneth MC

arXiv.org Artificial Intelligence

Large language models (LLMs) offer transformative potential for clinical decision support in spine surgery but pose significant risks through hallucinations, which are factually inconsistent or contextually misaligned outputs that may compromise patient safety. This study introduces a clinician-centered framework to quantify hallucination risks by evaluating diagnostic precision, recommendation quality, reasoning robustness, output coherence, and knowledge alignment. We assessed six leading LLMs across 30 expert-validated spinal cases. DeepSeek-R1 demonstrated superior overall performance (total score: 86.03 $\pm$ 2.08), particularly in high-stakes domains such as trauma and infection. A critical finding reveals that reasoning-enhanced model variants did not uniformly outperform standard counterparts: Claude-3.7-Sonnet's extended thinking mode underperformed relative to its standard version (80.79 $\pm$ 1.83 vs. 81.56 $\pm$ 1.92), indicating extended chain-of-thought reasoning alone is insufficient for clinical reliability. Multidimensional stress-testing exposed model-specific vulnerabilities, with recommendation quality degrading by 7.4% under amplified complexity. This decline contrasted with marginal improvements in rationality (+2.0%), readability (+1.7%) and diagnosis (+4.7%), highlighting a concerning divergence between perceived coherence and actionable guidance. Our findings advocate integrating interpretability mechanisms (e.g., reasoning chain visualization) into clinical workflows and establish a safety-aware validation framework for surgical LLM deployment.


SurgeryLSTM: A Time-Aware Neural Model for Accurate and Explainable Length of Stay Prediction After Spine Surgery

Cho, Ha Na, Sutari, Sairam, Lopez, Alexander, Bow, Hansen, Zheng, Kai

arXiv.org Artificial Intelligence

Objective: To develop and evaluate machine learning (ML) models for predicting length of stay (LOS) in elective spine surgery, with a focus on the benefits of temporal modeling and model interpretability. Materials and Methods: We compared traditional ML models (e.g., linear regression, random forest, support vector machine (SVM), and XGBoost) with our developed model, SurgeryLSTM, a masked bidirectional long short-term memory (BiLSTM) network with an attention mechanism, using structured perioperative electronic health records (EHR) data. Performance was evaluated using the coefficient of determination (R²), and key predictors were identified using explainable AI. Results: SurgeryLSTM achieved the highest predictive accuracy (R² = 0.86), outperforming XGBoost (R² = 0.85) and baseline models. The attention mechanism improved interpretability by dynamically identifying influential temporal segments within preoperative clinical sequences, allowing clinicians to trace which events or features most contributed to each LOS prediction. Bone disorder, chronic kidney disease, and lumbar fusion were identified as the most impactful predictors of LOS. Discussion: Temporal modeling with attention mechanisms significantly improves LOS prediction by capturing the sequential nature of patient data. Unlike static models, SurgeryLSTM provides both higher accuracy and greater interpretability, which are critical for clinical adoption. These results highlight the potential of integrating attention-based temporal models into hospital planning workflows. Conclusion: SurgeryLSTM presents an effective and interpretable AI solution for LOS prediction in elective spine surgery. Our findings support the integration of temporal, explainable ML approaches into clinical decision support systems to enhance discharge readiness and individualized patient care.
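
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how the coefficient of determination is commonly computed. It is not taken from the paper; the function name `r_squared` and the sample length-of-stay values are hypothetical and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days (not data from the study).
observed = [2.0, 3.0, 5.0, 8.0]
predicted = [2.5, 3.0, 4.5, 7.0]
score = r_squared(observed, predicted)
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than always predicting the mean LOS.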


Enhanced prediction of spine surgery outcomes using advanced machine learning techniques and oversampling methods

Benítez-Andrades, José Alberto, Prada-García, Camino, Ordás-Reyes, Nicolás, Blanco, Marta Esteban, Merayo, Alicia, Serrano-García, Antonio

arXiv.org Artificial Intelligence

The study proposes an advanced machine learning approach to predict spine surgery outcomes by incorporating oversampling techniques and grid search optimization. A variety of models including GaussianNB, ComplementNB, KNN, Decision Tree, and optimized versions with RandomOverSampler and SMOTE were tested on a dataset of 244 patients, which included pre-surgical, psychometric, socioeconomic, and analytical variables. The enhanced KNN models achieved up to 76% accuracy and a 67% F1-score, while grid-search optimization further improved performance. The findings underscore the potential of these advanced techniques to aid healthcare professionals in decision-making, with future research needed to refine these models on larger and more diverse datasets.
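
As an illustration of the two ideas mentioned above, the sketch below shows random oversampling of minority classes and the F1-score in plain Python. It is an assumption-laden toy, not the study's pipeline: the helper names `random_oversample` and `f1_score` are hypothetical, and the study itself names RandomOverSampler and SMOTE, which are not reimplemented here.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        indices = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(indices)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced toy data: four negatives, one positive.
X_bal, y_bal = random_oversample([[0], [1], [2], [3], [4]], [0, 0, 0, 0, 1])
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.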


A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgery

T, Suhail Ansari, Maik, Vivek, Naheem, Minhas, Ram, Keerthi, Lakshmanan, Manojkumar, Sivaprakasam, Mohanasankar

arXiv.org Artificial Intelligence

In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents a comprehensive development of a Robot-Assisted and Navigation-Guided IGSS System. The endeavor involves integrating cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgical methods. We propose an IGSS workflow and system architecture employing a hybrid-layered approach, combining modular and integrated system architectures in distinctive layers to develop an affordable system for seamless integration, scalability, and reconfigurability. We developed and integrated the system and extensively tested it on phantoms and cadavers. On phantoms, the proposed system's accuracy is 1.020 mm using navigation guidance and 1.11 mm using robot assistance. Cadaveric validation showed similar performance on the Gertzbein-Robbins scale: with navigation guidance, 84% of screw placements were grade A and 10% were grade B; with robot assistance, 90% were grade A and 10% were grade B, demonstrating the system's efficacy for IGSS. The evaluated performance is adequate for an IGSS and on par with existing systems in the literature and those commercially available. The user radiation is lower than in the literature, given that the system requires only an average of 3 C-Arm images per pedicle screw placement and verification.
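
The Gertzbein-Robbins grades cited above classify a pedicle screw by how far it breaches the pedicle cortex. The sketch below encodes the commonly cited thresholds (A: no breach, B: under 2 mm, C: under 4 mm, D: under 6 mm, E: 6 mm or more). Boundary handling differs slightly between publications, so the exact cut-offs here are an assumption, and the function names are hypothetical rather than anything from the paper.

```python
from collections import Counter


def gertzbein_robbins_grade(breach_mm):
    """Map a cortical breach distance in millimetres to a grade A-E.

    Thresholds follow the commonly cited 2 mm steps; treatment of the
    exact boundary values is an assumption of this sketch.
    """
    if breach_mm < 0:
        raise ValueError("breach distance cannot be negative")
    if breach_mm == 0:
        return "A"  # screw fully contained within the pedicle
    if breach_mm < 2:
        return "B"
    if breach_mm < 4:
        return "C"
    if breach_mm < 6:
        return "D"
    return "E"


def grade_distribution(breaches_mm):
    """Percentage of screws per grade, keyed by grade letter."""
    grades = Counter(gertzbein_robbins_grade(b) for b in breaches_mm)
    total = len(breaches_mm)
    return {g: 100.0 * n / total for g, n in sorted(grades.items())}


# Hypothetical breach measurements in mm (not data from the study).
distribution = grade_distribution([0, 0, 0, 1.0])
```

Grades A and B are often reported together as clinically acceptable placements, which is why abstracts like the one above quote those two percentages.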


Safe Deep RL for Intraoperative Planning of Pedicle Screw Placement

Ao, Yunke, Esfandiari, Hooman, Carrillo, Fabio, As, Yarden, Farshad, Mazda, Grewe, Benjamin F., Krause, Andreas, Fuernstahl, Philipp

arXiv.org Artificial Intelligence

Spinal fusion surgery requires highly accurate implantation of pedicle screw implants, which must be conducted in critical proximity to vital structures with a limited view of anatomy. Robotic surgery systems have been proposed to improve placement accuracy; however, state-of-the-art systems suffer from the limitations of open-loop approaches, as they follow traditional concepts of preoperative planning and intraoperative registration, without real-time recalculation of the surgical plan. In this paper, we propose an intraoperative planning approach for robotic spine surgery that leverages real-time observation for drill path planning based on Safe Deep Reinforcement Learning (DRL). The main contributions of our method are (1) the capability to guarantee safe actions by introducing an uncertainty-aware distance-based safety filter; and (2) the ability to compensate for incomplete intraoperative anatomical information, by encoding a-priori knowledge about anatomical structures with a network pre-trained on high-fidelity anatomical models. Planning quality was assessed by quantitative comparison with the gold standard (GS) drill planning. In experiments with 5 models derived from real magnetic resonance imaging (MRI) data, our approach was capable of achieving 90% bone penetration with respect to the GS while satisfying safety requirements, even under observation and motion uncertainty. To the best of our knowledge, our approach is the first safe DRL approach focusing on orthopedic surgeries.
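
To make the general idea of an uncertainty-aware, distance-based safety filter concrete, here is a toy sketch. It is not the authors' method: the `Candidate` type, the margin of 2 mm, the multiplier `k`, and the fallback action are all hypothetical assumptions. The sketch only shows one plausible rule, namely accepting a proposed action when its estimated clearance, reduced by a multiple of its uncertainty, still exceeds a required margin.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    distance_mm: float  # estimated clearance to the nearest critical structure
    sigma_mm: float     # uncertainty (standard deviation) of that estimate


def is_safe(candidate, margin_mm=2.0, k=2.0):
    """Pessimistic check: shrink the clearance by k standard deviations
    and require that the result still meets the safety margin."""
    return candidate.distance_mm - k * candidate.sigma_mm >= margin_mm


def filter_action(proposed, fallback, margin_mm=2.0, k=2.0):
    """Return the proposed action if it passes the check, else the fallback."""
    return proposed if is_safe(proposed, margin_mm, k) else fallback


# Hypothetical actions; values are illustrative only.
stop = Candidate("stop", distance_mm=10.0, sigma_mm=0.0)
confident = Candidate("advance", distance_mm=5.0, sigma_mm=1.0)
uncertain = Candidate("advance", distance_mm=5.0, sigma_mm=2.0)

chosen_confident = filter_action(confident, stop)
chosen_uncertain = filter_action(uncertain, stop)
```

In this toy rule, the same nominal clearance can be accepted or rejected depending on how uncertain the estimate is, which is the sense in which such a filter is uncertainty-aware.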


What's next for augmented reality in spine surgery?

#artificialintelligence

Augmented reality in spine surgery is taking off, and its growth could be exponential in the near future. With more and more places adopting the technology, many surgeons have a positive outlook for the technology. It's the first AR system driven by artificial intelligence, using machine-learning-based guidance and automated surgical planning. In December, Royal Philips expanded its ClarifEye AR spine system in Spain and Oman. In February, Neo Medical raised $20.6 million in funding, which will support the product's U.S. launch.


Surgalign receives FDA clearance for AI-driven HOLO Portal system for spine surgery - Spinal News International

#artificialintelligence

Surgalign Holdings has announced that it has received US Food and Drug Administration (FDA) 510(k) clearance for its HOLO Portal surgical guidance system for use within lumbar spine procedures. According to Surgalign, the HOLO Portal system is the world's first artificial intelligence (AI)-driven augmented reality (AR) guidance system for spine and the first clinical application of Surgalign's HOLO AI digital health platform. Terry Rich, president and chief executive officer of Surgalign, said: "Receiving the initial clearance for the HOLO Portal system is a significant milestone and represents a critical step toward building the foundation of the digital surgery of the future. This system is designed to improve patient outcomes by delivering intelligent solutions to our customers, and we believe it is truly revolutionary. With clearance in hand for our guidance application, our near-term focus is getting the platform into the hands of surgeons as we work towards a market release."


AR on the rise in 2021: 13 notes

#artificialintelligence

Augmented reality technology has taken off in spine and orthopedic surgery in 2021. In December, ClarifEye expanded in Spain and Oman. The hospital is the first in Denver to implement Augmedics' Xvision system. Riverside Healthcare partnered with Brainlab to add its Zeiss Kinevo Microscope to its spine program. The first spine case combining augmented reality and a surgical robot was recently performed by Kornelis Poelstra, MD, PhD, director of The Robotic Spine Institute of Silicon Valley in Los Gatos, Calif., in May.


Fiducial marker recovery and detection from severely truncated data in navigation assisted spine surgery

Fan, Fuxin, Kreher, Björn, Keil, Holger, Maier, Andreas, Huang, Yixing

arXiv.org Artificial Intelligence

Fiducial markers are commonly used in navigation assisted minimally invasive spine surgery (MISS) and they help transfer image coordinates into real world coordinates. In practice, these markers might be located outside the field-of-view (FOV), due to the limited detector sizes of C-arm cone-beam computed tomography (CBCT) systems used in intraoperative surgeries. As a consequence, reconstructed markers in CBCT volumes suffer from artifacts and have distorted shapes, which sets an obstacle for navigation. In this work, we propose two fiducial marker detection methods: direct detection from distorted markers (direct method) and detection after marker recovery (recovery method). For direct detection from distorted markers in reconstructed volumes, an efficient automatic marker detection method using two neural networks and a conventional circle detection algorithm is proposed. For marker recovery, a task-specific learning strategy is proposed to recover markers from severely truncated data. Afterwards, a conventional marker detection algorithm is applied for position detection. The two methods are evaluated on simulated data and real data, both achieving a marker registration error smaller than 0.2 mm. Our experiments demonstrate that the direct method is capable of detecting distorted markers accurately and the recovery method with task-specific learning has high robustness and generalizability on various data sets. In addition, the task-specific learning is able to reconstruct other structures of interest accurately, e.g. ribs for image-guided needle biopsy, from severely truncated data, which empowers CBCT systems with new potential applications.


Fusion Robotics merges with Integrity Implants to improve spinal care

#artificialintelligence

Integrity Implants, a company focused on minimally-invasive spine surgery, and Fusion Robotics, a developer of navigation and robotic targeting solutions for spine surgery, have merged. The combined company will be named "Accelus" and will focus on accelerating the adoption of minimally-invasive surgery for spinal care. Fusion received a 510(k) clearance for its initial product offering and started performing spine procedures in the U.S. market earlier in 2021. Integrity's flagship FlareHawk Lumbar Interbody Fusion Device received FDA clearance in 2016 and CE mark approval in 2021, and to date more than 10,500 FlareHawk devices have been implanted in more than 8,000 patients.